Abstract

We consider the problem of covering a graph with a given number of induced subgraphs
so that the maximum number of vertices in each subgraph is minimized. We
prove NP-completeness of the problem, prove lower bounds, and give approximation 
algorithms for certain graph classes. 


Let $G = (V, E)$ be a graph. The order of $G$ is the number $|V|$ of its vertices. For an
arbitrary subset of vertices $V' \subseteq V$, the induced subgraph denoted by $G[V']$ is the subgraph
of $G$ with vertex set $V'$ and all edges $e \in E$ such that both endpoints of $e$ belong to $V'$. In
other words, $G[V'] = (V', E')$ where $E' = \{(u, v) \in E : u, v \in V'\}$. The union of two graphs
$G_1(V_1, E_1)$ and $G_2(V_2, E_2)$ is the graph $G = (V_1 \cup V_2, E_1 \cup E_2)$. We say that a graph $H$ covers
a graph $G$ if and only if $G$ is a subgraph of $H$.

We consider the following optimization problem: given a graph $G = (V, E)$ and an integer
$k$, find $k$ subsets $V_1, V_2, \ldots, V_k$ of $V$ such that the induced subgraphs $G[V_i]$ together cover $G$ and
such that the maximum order of the induced subgraphs is minimized. Thus, for every edge $(u, v)$
of $G$ we require that there exists an $i$ in the range $1 \le i \le k$ such that both $u \in V_i$ and
$v \in V_i$; we wish to minimize $\max_{1 \le i \le k} |V_i|$. We denote this problem by Cover(G, k).

Without loss of generality, we can assume that each Vi has the same cardinality, since we 
can add extra vertices to any subset smaller than the largest without increasing the cost of 
the solution. 
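
The feasibility condition and the objective of Cover(G, k) can be checked mechanically. The following is a minimal sketch, assuming a graph given as a list of edges and a candidate solution given as a list of vertex sets; the function names `is_cover` and `cover_cost` are illustrative and do not come from the text.

```python
def is_cover(edges, subsets):
    """Return True if every edge has both endpoints in at least one subset."""
    vertex_sets = [set(s) for s in subsets]
    return all(any(u in s and v in s for s in vertex_sets) for u, v in edges)


def cover_cost(subsets):
    """Order of the largest induced subgraph, i.e. the quantity to minimize."""
    return max(len(set(s)) for s in subsets)


# A path on four vertices covered by k = 2 induced subgraphs.
path_edges = [(0, 1), (1, 2), (2, 3)]
candidate = [{0, 1, 2}, {2, 3}]
print(is_cover(path_edges, candidate), cover_cost(candidate))
```

Note that the two subsets share vertex 2: a cover is not required to be a partition of the vertices.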

Motivation Suppose we have a parallel computer with k processors, each with its own local 
memory. The local memory of each processor is bounded, and can store at most M words. 
We want to distribute n items of data, each occupying a single word in memory, among the 
processors so that they can execute a certain computation in parallel. An individual step 
in the computation requires a processor to read a set of operands from its memory, execute 
an operation, and write back the result again to its local memory. Performing the operation
requires that all operands be present in the local memory of the processor. 

We consider the case where the operations performed by the processor are binary, i.e., 
each operation requires reading exactly two operands. The computation is given as a graph 
G = (V, E) with n vertices where each vertex represents a data item and every edge of 
the graph represents a dependency between two data items. A processor can execute the 
operation corresponding to an edge (u, v) only if the operands corresponding to both u and v
are in the local memory of the processor.

We wish to minimize the maximum required size of local memory among all the processors
so that every edge can be "solved". This requires an assignment of data items or vertices to
each of the k processors. The subset of vertices assigned to processor i is $V_i$. We require that
the induced subgraphs $G[V_i]$ together cover the whole graph G. Minimizing the maximum
local memory among the processors is equivalent to minimizing the order of the largest 
induced subgraph. 

Related work Graph covering is a very well-studied problem; the online compendium of
NP-optimization problems [CK05], for instance, lists several NP-hard problems on partitioning
and covering graphs. Our problem is different from each of the problems in the list and
from the many variants of graph covering, either because the constraints are different (for 
instance, we do not require the covering subgraphs to be connected or to be edge disjoint), 
or because the objective function is different, or both. To the best knowledge of the author, 
no results on the particular problem we study in the current abstract have been published 
yet. 

1 Complexity 

We show that deciding whether a forest can be covered optimally is NP-complete. 

The problem is clearly in NP. We will show that it is NP-hard by a reduction from 
3-Partition which is defined as follows: 

Given: A set $A$ of $3m$ positive integer values $a_1, a_2, \ldots, a_{3m}$ and a positive integer $S$
such that $S/4 < a_i < S/2$ for all $i$ where $1 \le i \le 3m$ and such that $\sum_{i=1}^{3m} a_i = mS$.
Question: Can $A$ be divided into $m$ disjoint subsets $B_1, B_2, \ldots, B_m$ such that $\sum_{a_i \in B_j} a_i = S$
for all $1 \le j \le m$?

Note that because $S/4 < a_i < S/2$ for all $i$, any solution that answers the question in
the affirmative must have $|B_j| = 3$ for all $j$.

3-Partition is known to be strongly NP-complete [GJ79], i.e., it is NP-hard even when
all instances are encoded in unary. We demonstrate a polynomial-time reduction from an
arbitrary instance of 3-Partition to an instance of Cover(G, k) that preserves "yes" and
"no" answers.

The graph $G$ in the instance of Cover(G, k) will be a forest of $3m$ disjoint paths. For
each positive value $a_i$ in the given instance of 3-Partition, construct a path $P^{(i)}$ with $a_i$
vertices and $a_i - 1$ edges. Set $k = m$.



If the original instance of 3-Partition has a solution consisting of $B_1, B_2, \ldots, B_m$,
then we construct a solution to the new instance of Cover(G, k) in which $G[V_j]$ is the
union of the paths $P^{(i)}$ for all $i$ such that $a_i \in B_j$. Since the paths are disjoint, we have
$|V_j| = \sum_{a_i \in B_j} a_i = S$ for all $j$. Hence, we obtain a solution to the instance of Cover(G, k)
of cost at most $S$.

Next, consider a solution to the instance of Cover(G, k) in which $\max_{1 \le j \le k} |V_j| \le S$;
hence $\sum_{1 \le j \le k} |V_j| \le k \max_{1 \le j \le k} |V_j| \le kS = mS$.

The graph $G$ has $\sum_{1 \le i \le 3m} a_i = mS$ vertices. Consider the $mS \times k$ boolean incidence
matrix that has a 1 entry if and only if the corresponding vertex belongs to the corresponding
subset. The number of 1's in this incidence matrix is the sum of the cardinalities of each $V_j$.
Counting the number of 1's row-wise, we see that the number of 1's is also equal to the sum
over every vertex $v$ of the number of subsets that contain $v$. Since each vertex has positive
degree and because every edge must be covered, each vertex must belong to at least one
subset. Hence, it must be the case that $\sum_{1 \le j \le k} |V_j| \ge \sum_{1 \le i \le 3m} a_i = mS$. Therefore, we
conclude that $\sum_{1 \le j \le k} |V_j| = mS$. It follows that $|V_j| = S$ for all $j$ such that $1 \le j \le m$.

Since $G$ has $mS$ vertices and $\sum_{1 \le j \le k} |V_j| = mS$, each vertex must belong to at most
one subset. Hence, each vertex must belong to exactly one subset. Since all the edges are
covered, any two adjacent vertices must belong to the same subset. Therefore, for each path
$P^{(i)}$, all vertices of $P^{(i)}$ must belong to exactly one $V_j$. Hence, we obtain a solution to the
original instance of 3-Partition in which for all $i$ we have $a_i \in B_j$ if and only if $P^{(i)}$ belongs
to $V_j$. This is a valid solution because for all $j$ we have $\sum_{a_i \in B_j} a_i = |V_j| = S$.
We have thus proved the following theorem. 

Theorem 1. Cover(G, k) is NP-hard even when G is a disjoint union of paths.
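
The forward direction of the reduction can be illustrated on a small instance. The sketch below builds the forest of disjoint paths from a list of values and turns a given 3-partition into a cover. The vertex labelling `(i, t)`, the function names, and the sample instance with $S = 20$ are illustrative choices rather than part of the proof.

```python
def paths_from_3partition(values):
    """One path per value a_i, with a_i vertices labelled (i, 0), ..., (i, a_i - 1)."""
    paths, edges = [], []
    for i, a in enumerate(values):
        path = [(i, t) for t in range(a)]
        paths.append(path)
        edges.extend(zip(path, path[1:]))  # a_i - 1 edges along the path
    return paths, edges


def cover_from_partition(paths, groups):
    """V_j is the union of the vertex sets of the paths whose index lies in group B_j."""
    return [{v for i in group for v in paths[i]} for group in groups]


# Sample instance: m = 2, S = 20, every value strictly between S/4 and S/2.
values = [6, 7, 7, 6, 6, 8]
groups = [[0, 1, 2], [3, 4, 5]]  # a known 3-partition: 6+7+7 = 6+6+8 = 20
paths, edges = paths_from_3partition(values)
cover = cover_from_partition(paths, groups)
print([len(v) for v in cover])
```

Each subset has exactly $S$ vertices and contains whole paths only, which is the structure the converse direction of the proof forces on any cover of cost at most $S$.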

2 Lower bounds 

In this section, we prove lower bounds on the size of an optimum solution. 

2.1 A lower bound based on connectivity 

Clearly, $\max_i |V_i| \ge \lceil n/k \rceil$. Let $X$ be the intersection graph of the $V_i$'s. If $G$ is connected,
then $X$ must be connected, so it must have at least $k - 1$ edges. Each edge of $X$ corresponds
to some vertex that belongs to more than one subset, so $\sum_{i=1}^{k} |V_i| \ge n + k - 1$. Since the
maximum of a set is at least as large as its mean, we have

\[ \max_i |V_i| \ge \left\lceil \frac{n + k - 1}{k} \right\rceil = \left\lceil \frac{n - 1}{k} \right\rceil + 1. \tag{1} \]



Suppose $G$ is $\kappa$-connected. Let $N(V_i)$ denote the open neighborhood of $V_i$, i.e., $N(V_i)$
consists of all vertices in the complement of $V_i$ that are adjacent to some vertex in $V_i$.

Suppose $\max_i \{|V_i|\} < n - \kappa$. We claim that $|N(V_i)| \ge \kappa$. This must be true because
otherwise removing the vertices in $N(V_i)$ would disconnect $V_i$ from the rest of the graph
(note that $|V \setminus V_i| > \kappa$). Likewise, $|N(V \setminus V_i)| \ge \kappa$.

For each subset $V_i$, let $W_i$ denote $\bigcup_{j \ne i} V_j$. Note that $V \setminus V_i \subseteq W_i$.

\[ |V_i \cap N(V \setminus V_i)| \ge \kappa \implies |V_i \cap W_i| \ge \kappa. \]

Hence, at least $\kappa$ vertices in each subset $V_i$ belong to more than one subset.
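Claim 2.

\[ \sum_{i=1}^{k} |V_i| \ge n + \frac{1}{2} \sum_{i=1}^{k} \kappa_i, \]

where $\kappa_i$ is the number of vertices of $V_i$ that belong to more than one subset.

Proof. The proof is by induction on the number of vertices, $n$. The base case $n = 0$ is trivial.
Remove a vertex $v$ of $G$. Suppose $v$ belongs to $s \ge 1$ of the subsets. Let $V_i' = V_i \setminus \{v\}$ and let
$\kappa_i'$ be the number of vertices of $V_i'$ that belong to more than one of the subsets $V_1', \ldots, V_k'$.
Therefore, in this smaller graph, from the induction hypothesis, we have

\[ \sum_{i=1}^{k} |V_i'| \ge (n - 1) + \frac{1}{2} \sum_{i=1}^{k} \kappa_i'. \]

Removing $v$ lowers $\sum_i |V_i|$ by $s$; it lowers $\sum_i \kappa_i$ by $s$ if $s \ge 2$ and leaves it unchanged if $s = 1$.
In either case, $s - 1 \ge \frac{1}{2} \left( \sum_i \kappa_i - \sum_i \kappa_i' \right)$. Hence, in the original graph,

\[ \sum_{i=1}^{k} |V_i| = s + \sum_{i=1}^{k} |V_i'| \ge n + \frac{1}{2} \sum_{i=1}^{k} \kappa_i. \]

□

From the previous argument, we have $\sum_{i=1}^{k} \kappa_i \ge k\kappa$. Hence,

\[ \max_i |V_i| \ge \min \left\{ n - \kappa, \; \frac{n}{k} + \frac{\kappa}{2} \right\} \tag{2} \]

which is better than the lower bound of Equation (1) whenever $\kappa > 2$.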

2.2 A lower bound for dense graphs 

Let $p(m)$ be an upper bound on the number of edges in an induced subgraph of $G$ of order
$m$. The function $p$ is a measure of the density of $G$. Any subset of $K$ or fewer vertices
will cover at most $p(K)$ edges. Since every edge in $G$ must be covered by the $k$ induced
subgraphs, we must have

\[ \max_{1 \le i \le k} \{|V_i|\} \ge \min \{m : k\,p(m) \ge e(G)\}, \]

where $e(G)$ denotes the number of edges of $G$. Since $p(m) \le \binom{m}{2}$,

\[ \min \{m : k\,p(m) \ge e(G)\} \ge \min \left\{m : k \binom{m}{2} \ge e(G)\right\}; \]

hence,

\[ \max_{1 \le i \le k} \{|V_i|\} \ge \frac{1}{2} \left( 1 + \sqrt{1 + \frac{8e(G)}{k}} \right) > \frac{1}{2} \left( 1 + \sqrt{\frac{8e(G)}{k}} \right) > \sqrt{\frac{2e(G)}{k}}. \tag{3} \]

For dense graphs where $e(G) = \Omega(n^2)$, Equation (3) gives us a lower bound of $\Omega(n \cdot k^{-1/2})$
which is better than that of Equation (1). In particular, the bound of $\Theta(n \cdot k^{-1/2})$ is tight
when $G$ is a clique.
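
As a numerical illustration of the density bound with $p(m) = \binom{m}{2}$, the sketch below computes the smallest $m$ satisfying $k \binom{m}{2} \ge e(G)$ by direct search and compares it with the closed form in Equation (3). The function names are illustrative, and the sample values (the clique on 10 vertices with $k = 3$) are an arbitrary choice.

```python
from math import comb, sqrt


def dense_lower_bound(num_edges, k):
    """Smallest m with k * C(m, 2) >= num_edges, found by linear search."""
    m = 1
    while k * comb(m, 2) < num_edges:
        m += 1
    return m


def closed_form_bound(num_edges, k):
    """First expression in Equation (3): (1 + sqrt(1 + 8 e / k)) / 2."""
    return 0.5 * (1 + sqrt(1 + 8 * num_edges / k))


# Clique on 10 vertices has 45 edges; cover it with k = 3 induced subgraphs.
print(dense_lower_bound(45, 3), closed_form_bound(45, 3), sqrt(2 * 45 / 3))
```

For this instance the search and the closed form agree, and both exceed the weaker bound $\sqrt{2e(G)/k}$.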

2.3 Another lower bound 

Suppose $V_1, V_2, \ldots, V_k$ is a feasible solution (not necessarily optimum) to Cover(G, k);
i.e., for every edge $(u, v) \in E$ there exists $\ell$ such that $\{u, v\} \subseteq V_\ell$. Let $S \subseteq V$ be an arbitrary
subset of vertices. Let $N(S)$ denote the neighborhood of the set $S$, i.e., $N(S) = \{v \in V \setminus S :
\exists u \in S, (u, v) \in E\}$.

Let $C(S)$ denote $\{V_\ell : V_\ell \cap S \ne \emptyset, 1 \le \ell \le k\}$. We claim that $S \cup N(S) \subseteq \bigcup_{V_\ell \in C(S)} V_\ell$. By
definition of $C(S)$, we have $S \subseteq \bigcup_{V_\ell \in C(S)} V_\ell$. Let $u \in S$ and $v \in N(S)$ such that $(u, v) \in E$.
Any subset $V_\ell$ that covers the edge $(u, v)$ must contain both $u$ and $v$ and, since $V_\ell$ contains $u \in S$,
it must be the case that $V_\ell \in C(S)$.

Therefore,

\[ \max_{V_\ell \in C(S)} |V_\ell| \ge \frac{|S| + |N(S)|}{|C(S)|}. \]

In particular, we have shown that

\[ \max_{1 \le i \le k} |V_i| \ge \max_{S \subseteq V} \frac{|S| + |N(S)|}{k}. \tag{4} \]

The question arises: how good are the lower bounds in this section? The author suspects 
that they can be strengthened significantly, as evidenced by the following lemma. 

Lemma 3. There exists an infinite family of trees such that, for every tree $T$ in the family
with $n$ vertices, every optimum cover of $T$ with two induced subgraphs (i.e., $k = 2$) must cost
at least $\lceil n/2 \rceil + \Omega(\log n)$.

Proof. Construct a family of trees indexed by the integers inductively as follows. Each tree
in the family will have a vertex designated as the root. Let $T_0$ denote a tree with a single
vertex which is also the root. For each $h \ge 1$, the tree $T_h$ consists of a new root vertex plus
three copies of $T_{h-1}$ such that the root of $T_h$ is adjacent to the three roots of the copies of
$T_{h-1}$. It can be easily verified that $T_h$ is a tree with $(3^{h+1} - 1)/2$ vertices.

For each tree $T_h$ in the above family, we apply the lower bound of Equation (4). Let $V$
be the vertex set of $T_h$. Let $S \subset V$ be an arbitrary subset of vertices such that $|S| = |V|/2$.
It can be shown that $|N(S)| \ge h - 1$. Hence, the lemma follows by Equation (4). □
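
The inductive construction of the trees $T_h$ is easy to reproduce. The sketch below builds $T_h$ as an edge list and reports the vertex count, which can be compared with $(3^{h+1} - 1)/2$. The function name `build_tree` and the integer vertex labels are illustrative.

```python
def build_tree(h):
    """Return (number of vertices, edge list, root) of T_h with vertices 0, 1, 2, ..."""
    edges = []
    counter = 0

    def build(level):
        nonlocal counter
        root = counter
        counter += 1
        if level > 0:
            # A new root adjacent to the roots of three copies of T_{level-1}.
            for _ in range(3):
                child_root = build(level - 1)
                edges.append((root, child_root))
        return root

    root = build(h)
    return counter, edges, root


for h in range(4):
    n, edges, _ = build_tree(h)
    print(h, n, (3 ** (h + 1) - 1) // 2)
```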



3 Approximation algorithms 

We turn our attention to specific graph classes and efficient algorithms to approximate the 
optimum solution. 

3.1 Covering a caterpillar exactly 

A caterpillar is a tree such that deleting all its leaves causes a single path to remain. Let 
$T$ be a tree and let $V'$ be the set of leaves of $T$; then, $T$ is a caterpillar if and only if
$T \setminus V'$ is a single path $P$.

Theorem 4. A caterpillar can be covered optimally by a greedy algorithm. 

Proof. We show that a caterpillar $T$ can be covered optimally with exactly $\lceil n/k \rceil + 1$ vertices
in the induced subgraph of maximum order. Order the vertices of $T$ in the following manner.
Let $P$ be the path that remains after deleting all leaves of $T$. Let $u$ be one of the two endpoints
of $P$. Choose $u$ to be the first vertex in the order, followed by all leaves of $T$ adjacent to $u$ in
arbitrary order. Continue by ordering the vertices of $T \setminus u$ so that they follow in the order.

Given the above vertex ordering, choose the prefix of the first $\lceil n/k \rceil + 1$ vertices as the
set $V_1$. Remove the edges of $T$ in the induced subgraph $G[V_1]$ and repeat the procedure on
the remaining graph until we have subsets $V_1, V_2, \ldots, V_k$. The last subset $V_k$ may contain
fewer vertices.

Note that no edge is covered by more than one induced subgraph. For each $i$ in the
range $1 \le i \le k - 1$, the induced subgraph $G[V_i]$ is a subtree of $T$ with exactly $\lceil n/k \rceil + 1$
vertices; hence, $G[V_i]$ contains exactly $\lceil n/k \rceil$ edges. Therefore, $\bigcup_{1 \le i \le k-1} G[V_i]$ covers exactly
$(k - 1) \lceil n/k \rceil$ edges of $T$. The remaining $(n - 1) - (k - 1) \lceil n/k \rceil \le n/k - 1$ edges are easily
seen to be covered by $G[V_k]$ while ensuring that $|V_k| \le n/k$. □
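
One way to read the greedy procedure is as follows: the vertex ordering induces an ordering of the edges (each vertex other than the first is paired with the path vertex it hangs from), and each subset takes the next $\lceil n/k \rceil$ edges in that order. The sketch below implements this reading; the input representation (a list `spine` for the path $P$ and a dictionary `leaves` mapping each path vertex to its leaves) and the function name are assumptions made for illustration.

```python
def cover_caterpillar(spine, leaves, k):
    """Greedy cover of a caterpillar; returns at most k vertex subsets.

    spine  -- vertices of the path P in order, starting from an endpoint
    leaves -- dict mapping a path vertex to the list of leaves adjacent to it
    """
    # Edge order induced by the vertex order: u, leaves of u, next path vertex, ...
    edge_order = []
    for idx, u in enumerate(spine):
        for leaf in leaves.get(u, []):
            edge_order.append((u, leaf))
        if idx + 1 < len(spine):
            edge_order.append((u, spine[idx + 1]))

    n = len(edge_order) + 1       # a tree with n vertices has n - 1 edges
    chunk = -(-n // k)            # ceil(n / k) edges per subset
    subsets = []
    for start in range(0, len(edge_order), chunk):
        block = edge_order[start:start + chunk]
        # Consecutive edges form a subtree, so the block spans len(block) + 1 vertices.
        subsets.append({x for edge in block for x in edge})
    return subsets


spine = [0, 1, 2, 3]
leaves = {0: [4, 5], 1: [6], 2: [], 3: [7, 8, 9]}
print(cover_caterpillar(spine, leaves, 3))
```

Consecutive subsets share exactly one path vertex, which is why each subset has one more vertex than the number of edges it covers.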

3.2 Covering graphs of bounded degree 

Construct a vertex cover C of G as follows: construct a maximal matching and include both 
endpoints of each edge in the matching. We get a vertex cover whose size is at most twice 
the minimum possible. Let \C\ = c. 

Let $N[u]$ denote the closed neighborhood of a vertex $u$. (The closed neighborhood of
$u$ consists of $u$ and all vertices adjacent to $u$. $N(u)$ denotes the open neighborhood of $u$:
$N(u) = N[u] - \{u\}$.) Then $|N[u]| \le \Delta + 1$, where $\Delta$ is the maximum degree of $G$. Start with
$c$ subsets of vertices, each consisting of the closed neighborhood of a vertex in $C$. Clearly,
every edge of $G$ has both endpoints in some subset.

Assume that $c > k$. Repeatedly merge the two smallest subsets, until after $\lfloor \lg(c/k) \rfloor$
steps we have only $k$ subsets. Each step at most doubles the size of the largest subset.
Therefore, at the end of this process,

\[ \max_i |V_i| \le \frac{c}{k} (\Delta + 1). \]

The time taken by this process of merging is $O(\log(c/k)) = O(\log(n/k))$.

Since the lower bound is $\lceil n/k \rceil$, this algorithm gives an approximation ratio of

\[ \frac{(c/k)(\Delta + 1)}{n/k} = \frac{c}{n} (\Delta + 1) \le \Delta + 1. \]

The total running time of the algorithm is easily seen to be linear in the size of the graph.
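
The procedure of this section can be sketched as follows, assuming the graph is given as a dictionary mapping each vertex to the set of its neighbors and that $k \ge 1$. The function name and the use of a binary heap to find the two smallest subsets are implementation choices, not part of the text. The sketch merges one pair at a time until $k$ subsets remain and does not attempt to verify the size bound.

```python
import heapq


def cover_bounded_degree(adj, k):
    """Cover via closed neighborhoods of a matching-based vertex cover; k >= 1."""
    # Greedy maximal matching; the matched endpoints form a vertex cover C.
    matched = set()
    for u in adj:
        if u in matched:
            continue
        for v in adj[u]:
            if v not in matched:
                matched.update((u, v))
                break

    # One subset per vertex of C: its closed neighborhood N[u].
    # The integer in the middle of each tuple breaks ties so sets are never compared.
    heap = [(len(adj[u]) + 1, i, {u} | adj[u]) for i, u in enumerate(matched)]
    heapq.heapify(heap)
    tie = len(heap)

    # Repeatedly merge the two smallest subsets until only k remain.
    while len(heap) > k:
        _, _, a = heapq.heappop(heap)
        _, _, b = heapq.heappop(heap)
        merged = a | b
        heapq.heappush(heap, (len(merged), tie, merged))
        tie += 1
    return [s for _, _, s in heap]


cycle = {i: {(i - 1) % 8, (i + 1) % 8} for i in range(8)}
print(cover_bounded_degree(cycle, 2))
```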

3.3 Covering c-inductive graphs

An interesting class of graphs is the class of c-inductive (also called c-degenerate) graphs. 

Definition 5. A graph G is c-inductive if every subgraph of G has a vertex of degree at
most c.

Equivalently, a graph G is c-inductive if it has a vertex u of degree at most c such that 
G\u is c-inductive; the empty graph is c-inductive by definition. 

Theorem 6. There exists an algorithm for c-inductive graphs with approximation ratio c+1. 

Proof. First, partition the $n$ vertices of $G$ into $k$ equitable subsets $V_1, V_2, \ldots, V_k$, each of
cardinality either $\lfloor n/k \rfloor$ or $\lceil n/k \rceil$.

Next, compute a c-inductive ordering of vertices as follows. Let $v_1$ be a vertex of degree
at most $c$ in $G$, let $v_2$ be a vertex of degree at most $c$ in $G - v_1$, and so on. In general, $v_i$ is
a vertex of degree at most $c$ in $G - \{v_1, v_2, \ldots, v_{i-1}\}$; such a vertex must exist because $G$ is
c-inductive.

Now, for each subset $V_i$ for $i = 1, 2, \ldots, k$, let $V_i' = \emptyset$ initially. Consider each vertex
$v_j \in V_i$ in the inductive order restricted to vertices in $V_i$. Include in $V_i'$ all neighbors $v_\ell$ of
$v_j$ with index greater than $j$ such that $v_\ell \in V \setminus (V_i \cup V_i')$; due to the inductive ordering,
there are at most $c$ neighbors of $v_j$ with index greater than $j$. Thus, $|V_i'| \le c |V_i| \le c \lceil n/k \rceil$.
Finally, the desired subsets of vertices are $V_i \cup V_i'$ for $1 \le i \le k$.

Suppose $(v_j, v_\ell)$ is an edge of $G$ with $j < \ell$ in the inductive ordering. If both $v_j$ and $v_\ell$
belong to $V_i$ for some $i$, then the edge $(v_j, v_\ell)$ is certainly covered. Otherwise, let $v_j \in V_i$ and
$v_\ell \in V_m$. When $v_j$ is encountered during the $i$th stage, $v_\ell$ is included in $V_i'$ if it is not in $V_i'$
already. Thus, every edge is covered when the algorithm terminates.

We have derived an upper bound of $(c + 1) \lceil n/k \rceil$ on the cardinality of $V_i \cup V_i'$ for every $i$,
which gives the approximation ratio of the algorithm as $c + 1$. □



As a consequence, the above algorithm achieves in linear time a 2-approximation for 
forests (and trees), a 6-approximation for planar graphs, and a 3-approximation for
outerplanar graphs.
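
The algorithm in the proof of Theorem 6 can be sketched as follows. The sketch assumes the graph is given as a dictionary mapping each vertex to the set of its neighbors; it obtains an inductive ordering by repeatedly removing a vertex of minimum degree, using a simple quadratic-time loop rather than a linear-time implementation. The function names and the round-robin choice of the equitable partition are illustrative.

```python
def inductive_order(adj):
    """Repeatedly remove a minimum-degree vertex; return the order and the value c."""
    remaining = {u: set(nb) for u, nb in adj.items()}
    order, c = [], 0
    while remaining:
        u = min(remaining, key=lambda x: len(remaining[x]))
        c = max(c, len(remaining[u]))
        order.append(u)
        for w in remaining.pop(u):
            remaining[w].discard(u)
    return order, c


def cover_inductive(adj, k):
    """Return (list of k vertex subsets, c) following the proof of Theorem 6."""
    order, c = inductive_order(adj)
    index = {u: j for j, u in enumerate(order)}
    vertices = list(adj)
    # Equitable partition: part sizes differ by at most one.
    parts = [set(vertices[i::k]) for i in range(k)]
    cover = []
    for part in parts:
        # Each vertex contributes its neighbors that come later in the ordering;
        # there are at most c of them per vertex.
        extra = {w for u in part for w in adj[u] if index[w] > index[u]}
        cover.append(part | extra)
    return cover, c


# A small tree (1-inductive): vertex i > 0 is adjacent to vertex (i - 1) // 2.
tree = {i: set() for i in range(10)}
for i in range(1, 10):
    tree[i].add((i - 1) // 2)
    tree[(i - 1) // 2].add(i)
print(cover_inductive(tree, 3))
```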



3.4 Heuristic for graph classes with separator theorems 

A separator theorem for a class of graphs $\mathcal{G}$ is a theorem of the following form [LT79]:

There exist constants $\alpha < 1$ and $\beta > 0$ such that if $G$ is any $n$-vertex graph in
$\mathcal{G}$, then the vertices of $G$ can be partitioned into three subsets $A$, $B$, and $C$ such
that no edge joins a vertex in $A$ with a vertex in $B$, neither $A$ nor $B$ contains
more than $\alpha n$ vertices, and $C$ contains at most $\beta f(n)$ vertices.

Such a subset $C$ is said to be an $(\alpha, \beta f(n))$-separator of $G$.

A natural recursive algorithm for covering a graph $G \in \mathcal{G}$ with $k$ induced subgraphs is the
following. Find subsets of vertices $A$, $B$, and $C$ as above. Without loss of generality, assume
that $A$ has no more vertices than $B$. Recursively construct a cover with $\lfloor k/2 \rfloor$ induced
subgraphs of $G[A \cup C]$ and a cover with $\lceil k/2 \rceil$ induced subgraphs of $G[B \cup C]$. The recursion
terminates when $k = 1$ with the trivial solution. Since $C$ is a separator, every edge of $G$
belongs to either $G[A \cup C]$ or $G[B \cup C]$; hence, we indeed obtain a cover.

The solution obtained by the above recursive algorithm is close to optimal if $f(n) = o(n)$,
i.e., if every graph in the class $\mathcal{G}$ has a separator of sublinear order.
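
The recursion can be sketched independently of how separators are found. In the sketch below the separator routine is passed in as a callback `find_separator`, which is a hypothetical interface; the example callback `path_separator` handles only the simple case of a path on consecutive integers, where the middle vertex is a separator. The early exit for vertex sets with at most two vertices is an added convenience, so the result may contain fewer than $k$ subsets.

```python
def cover_by_separator(vertices, k, find_separator):
    """Recursive cover of G[vertices] with at most k subsets.

    find_separator(vertices) must return a partition (A, B, C) of the vertex set
    such that no edge of the induced subgraph joins A and B.
    """
    if k == 1 or len(vertices) <= 2:
        return [set(vertices)]
    a, b, c = find_separator(vertices)
    if len(a) > len(b):
        a, b = b, a  # the smaller side receives floor(k/2) subsets
    return (cover_by_separator(a | c, k // 2, find_separator)
            + cover_by_separator(b | c, k - k // 2, find_separator))


def path_separator(vertices):
    """Separator for a path on consecutive integers: the middle vertex."""
    ordered = sorted(vertices)
    mid = len(ordered) // 2
    return set(ordered[:mid]), set(ordered[mid + 1:]), {ordered[mid]}


# Path 0 - 1 - ... - 15 covered with at most k = 4 induced subgraphs.
print(cover_by_separator(set(range(16)), 4, path_separator))
```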

4 A dual problem 

A natural dual problem is to cover a given graph with as few induced subgraphs as possible, 
each with a fixed maximum number of vertices. Given a graph $G = (V, E)$ and an integer
$m$, cover $G$ with the minimum number of induced subgraphs $G[V_1], G[V_2], \ldots, G[V_k]$, such
that $|V_i| \le m$ for all $i$. Here, the problem is to minimize the number of processors, each with
a fixed amount of local memory, to cover the given computation graph.

The dual problem is also NP-complete by the same proof as for the primal; see Section 1.

5 Extension to covering hypergraphs 

The problem can be generalized to the case of covering a hypergraph. The computation 
can be modeled by a hypergraph $\mathcal{H}$ with vertex set $[n]$. Each edge of the hypergraph is a
set of data items that are operands of any single operation and are therefore required to be
stored together in some processor's memory. The problem is to assign the vertices of $\mathcal{H}$ to
$p$ subsets $V_1, V_2, \ldots, V_p$ such that $|V_i| \le K$ and every edge of $\mathcal{H}$ belongs to at least one of
the subgraphs of $\mathcal{H}$ induced by $V_1, V_2, \ldots, V_p$. In other words, we need to cover $\mathcal{H}$ with $p$
induced subgraphs $\mathcal{H}[V_1], \mathcal{H}[V_2], \ldots, \mathcal{H}[V_p]$ such that the order of each subgraph is at most
$K$.

Note that, in general, a single vertex may belong to more than one subset. Unlike in 
the graph case, a single hyperedge $e$ can be covered more than once but only if there exists
some other hyperedge $f$ such that $e \subset f$, and $e$ and $f$ are covered by different subgraphs.
On the other hand, we can assume without loss of generality that no hyperedge is contained 
in another. 